Gene for increased somatic recombination

TECHNICAL FIELD 

The present invention relates to DNA that encodes proteins that control somatic 
recombination, in particular in plants. 

BACKGROUND 

Cells of all organisms have evolved a series of DNA repair pathways that counteract the 
deleterious effects of DNA damage and are triggered by intricate signal cascades. 
Homologous recombination in plants stabilizes the genome by repairing damaged
chromosomes while simultaneously generating genetic variability through the creation of new
genes and new genetic linkages. Repair of DNA damage by recombination is particularly
significant for cells under exogenous and endogenous genotoxic stress because of its 
potential to remove a wide range of DNA lesions. The current understanding of genetic and 
molecular components underlying meiotic and somatic recombination and DNA repair in 
plants is limited. To be able to modify or improve DNA repair using gene technology it is 
necessary to identify key proteins involved in said pathways or cascades. 

The precise manipulation of the genome of higher plants is still a major challenge for plant 
genetic engineering. Some advances have been made recently for the creation of point 
mutations at predetermined positions by chimeric RNA/DNA oligonucleotides (Beetham et al. 
1999, Hohn & Puchta 1999, Zhu et al. 1999, Kipp et al. 2000, Zhu et al. 2000). However, the 
targeted insertion of longer stretches of DNA sequence at any desired location ("knock-in") 
or the replacement of predetermined plant genomic sequences by heterologous DNA
("knock-out") via homologous recombination is at present not possible as a routine technique
(Mengiste & Paszkowski 1999, Puchta 2002). 

Few reports have appeared in the literature that describe successful "gene targeting" in 
higher plants (Paszkowski et al. 1988, Lee et al. 1990, Offringa et al. 1990, Miao & Lam 
1995, Kempin et al. 1997, Hanin et al. 2001), but the reported absolute numbers and relative
frequencies of the desired events were very low. Indeed, the main problem for "gene
targeting" experiments is the low frequency of the desired homologous recombination events
(10^-3 to 10^-5; Hohn & Puchta 1999, Mengiste & Paszkowski 1999) relative to illegitimate
recombination/integration events.

Various attempts to increase the low relative frequency of targeted homologous
recombination events, by improved selection schemes ("positive-negative selection") or by
providing extended regions of sequence homology, were not successful (Thykjaer et al.
1997, Gallego et al. 1999). One promising strategy to facilitate gene targeting in higher
plants would be to shift the balance between illegitimate and homologous recombination
events towards the latter by facilitating homologous recombination in plants through
genetic manipulation (Gherbi et al. 2001).

One approach described in the literature is the expression in plants of heterologous proteins 
known to be involved in homologous recombination. Overproduction of the bacterial 
resolvase RuvC was shown to increase somatic inter- and intra-chromosomal recombination,
as well as extrachromosomal recombination (Shalev et al. 1999), but no gene targeting
studies have yet been reported with this system. Expression of the bacterial RecA protein had
similar effects (Reiss et al. 1996, Reiss et al. 1997), but subsequent experiments did not 
show an increase of gene targeting events (Reiss et al. 2000). So far, it is not clear whether 
heterologous proteins can successfully interact with the plant recombination machinery to 
affect the outcome of the recombination events required for gene targeting. In addition, 
these foreign proteins might have undesired side effects in plants. 

An alternative approach is to rely on endogenous plant genes to influence the frequency of 
homologous recombination events. So far, indirect approaches have been reported to isolate
plant genes involved in recombination. The cloning of plant orthologs to recombination and 
repair genes from other species was reported (Klimyuk & Jones 1997, Doutriaux et al. 1998, 
Hartung & Puchta 1999, Gallego et al. 2000, Lin et al. 2000), but so far the importance of 
these genes for recombination in plants has only been evaluated for the RAD50 homologue 
(Gherbi et al. 2001). Functional screens have been carried out to identify plant mutants
hypersensitive to genotoxic treatments (Davies et al. 1994, Jenkins et al. 1995, Jiang et al. 
1997, Masson et al. 1997, Albinsky et al. 1999, Mengiste et al. 1999). Since recombination is
an important mechanism for DNA repair, some of these mutants might be affected in their
recombination behavior. This was experimentally demonstrated for some X-ray
hypersensitive Arabidopsis mutants that also showed reduced levels of somatic
recombination (Masson & Paszkowski 1997), although the affected gene has not been
isolated. Recently, a DNA damage hypersensitive Arabidopsis mutant was isolated from a
T-DNA tagged population; the affected gene (MIM) was cloned and shown to encode an SMC
(Structural Maintenance of Chromatin) protein. Since the mim mutant showed decreased
frequencies of somatic recombination, MIM seems to be involved in some aspect of somatic
recombination (Mengiste et al. 1999). A hyperrecombinogenic mutant was also isolated in
tobacco (Gorbunova et al. 2000). However, the gene affected could not be isolated so far.

Previously, a genetic system was described to study somatic homologous recombination 
between repeated sequences in whole plants (Swoboda et al. 1994, Puchta et al. 1995a, 
Puchta et al. 1995b). Briefly, a transgene carrying two non-functional halves of the
β-glucuronidase reporter gene sharing a stretch of sequence identity serves as a reporter
construct. Homologous recombination between the repeated sequences results in the
restoration of a functional reporter gene. Such events were detected by a sensitive
histochemical assay, and confirmed by Southern blotting. This assay is destructive, since the
staining procedure is lethal, so that direct isolation of mutants is difficult.

There is a need in the art to identify genes that increase somatic recombination and this 
invention meets that need.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts an alignment of an AtIno80 sequence (SEQ ID NO:1) and a public sequence,
At3g57300 (SEQ ID NO:3), showing a splicing difference ("Query" refers to the AtIno80
sequence; "Sbjct" to the public database sequence, gi|18410689|ref|NM_115590.1|
(AGI:At3g57300)).

SUMMARY OF THE INVENTION 

The present invention provides an isolated nucleic acid, in particular DNA, comprising a 
sequence having 98.5% or more identity with the sequence depicted in SEQ ID NO: 1 
(AtIno80). Also provided are vectors and host cells comprising the nucleic acids of the
invention, as well as polypeptides encoded by the nucleic acids. 

In a further aspect of the invention, a method for inducing homologous recombination in a 
cell is provided, comprising modulating the expression or properties of one or more gene 
products selected from the group consisting of AtIno80 (SEQ ID NOs:1 and 2), At3g57300
(SEQ ID NO:3), Rvb1 (At5g22330; SEQ ID NOs: 4 and 5), Rvb21 (At5g67630; SEQ ID NOs:
6 and 7), Rvb22 (At3g49830; SEQ ID NOs: 8 and 9), At3g57290 (SEQ ID NO: 10), AtArp5.1
(At3g12380; SEQ ID NOs: 11 and 12), AtArp5.2 (At5g56180; SEQ ID NOs: 13 and 14),
AtArp5.3 (At3g60830; SEQ ID NOs: 15 and 16) and AtArp8 (At5g43500; SEQ ID NOs: 17 and
18), their homologues, fragments or derivatives. In one embodiment, modulation is achieved
by increasing expression of the gene product, such as by introducing a nucleic acid encoding 
the gene product into the cell operably linked to a promoter; and allowing transcription and 
translation of the gene in an amount sufficient to affect homologous recombination in said 
cell. 

The method can be used to increase somatic homologous recombination and/or meiotic 
homologous recombination. The promoter can be an inducible promoter, a tissue-specific 
promoter, a constitutive promoter or a meiosis-specific promoter, depending on the desired 
effect.

Also provided is a method of increasing gene targeting to a desired locus in a host cell
comprising introducing a desired gene into a host cell, modulating the expression or
properties of one or more gene products selected from the group consisting of AtIno80,
At3g57300, Rvb1, Rvb21, Rvb22, At3g57290, AtArp5.1, AtArp5.2, AtArp5.3 and AtArp8, or 
functional fragments, derivatives and homologues thereof in the host cell, and detecting 
integration of the desired gene at a selected locus in the genome of the host cell. 

DETAILED DESCRIPTION OF THE INVENTION 

The present inventors have used a direct screening approach to identify mutants of 
Arabidopsis thaliana showing increased frequencies of somatic recombination, by visualizing 
recombination events in living plants from a mutagenized population and directly isolating 
plants with the desired phenotype. The description below describes a genetic screen and an 
Arabidopsis mutant sm22 derived from it, and the associated plant genes responsible for the 
altered recombination phenotype. 

Existing technologies for gene targeting in plants are very inefficient. The modulation of the 
expression or properties of one or more gene products selected from the group consisting of 
AtIno80 (SEQ ID NOs:1 and 2), At3g57300 (SEQ ID NO:3), Rvb1 (At5g22330; SEQ ID NOs:
4 and 5), Rvb2 (1 and 2; also referred to herein as Rvb21, At5g67630; SEQ ID NOs: 6 and 7,
and Rvb22, At3g49830; SEQ ID NOs: 8 and 9, respectively), At3g57290 (SEQ ID NO: 10),
AtArp5.1 (At3g12380; SEQ ID NOs: 11 and 12), AtArp5.2 (At5g56180; SEQ ID NOs: 13 and
14), AtArp5.3 (At3g60830; SEQ ID NOs: 15 and 16) and AtArp8 (At5g43500; SEQ ID NOs: 17
and 18), their homologues (including orthologs), fragments or derivatives increases the
efficiency of gene targeting events and facilitates the routine manipulation of the genome of
higher plants by homologous recombination. For the purposes of this disclosure, to avoid
repetition, reference to the above group of gene products is meant to include reference to
each gene individually, i.e., the modulation of the expression or properties of AtIno80, the
modulation of the expression or properties of At3g57300, and so on.

An in vivo screen for Arabidopsis mutants has been devised to allow direct detection of 
mutants with altered recombination. As a result of the screen, mutant plants with a more 
than 10-fold increased or altered frequency of somatic recombination events are provided, 
as well as the plant gene AtIno80. One or more of AtIno80, At3g57300, Rvb1, Rvb21,
Rvb22, At3g57290, AtArp5.1, AtArp5.2, AtArp5.3 and AtArp8 or orthologs from other plant
species are affected in these mutant plants. The screen allows the identification of mutant
plants and plant genes with a strong effect on recombination having little or no undesired
side effects on the plant. An increase in homologous recombination frequency is useful to 
achieve an increased efficiency of gene targeting in plants. 

Within the context of the present invention reference to a gene is to be understood as 
reference to a DNA coding sequence associated with regulatory sequences, which allow 
transcription of the coding sequence into RNA such as mRNA, rRNA, tRNA, snRNA, sense 
RNA or antisense RNA. Examples of regulatory sequences are promoter sequences, 5' and 
3' untranslated sequences, introns, and termination sequences. 

A promoter is understood to be a DNA sequence initiating transcription of an associated 
DNA sequence, and may also include elements that act as regulators of gene expression 
such as activators, enhancers, or repressors. 

Expression of a gene refers to its transcription into RNA or its transcription and subsequent 
translation into protein within a living cell. In the case of antisense constructs expression 
refers to the transcription of the antisense DNA only. 

The term transformation of cells designates the introduction of nucleic acid into a host cell,
particularly the stable integration of a DNA molecule into the genome of said cell.

Any part or piece of a specific nucleotide or amino acid sequence is referred to as a
component sequence or fragment.

In one aspect of the invention, nucleic acids and polypeptides are provided that can 
modulate homologous recombination. A nucleic acid according to the present invention 
comprises a sequence having 98.5%, 99%, 99.5% or more identity with the sequence
depicted in SEQ ID NO:1. The nucleic acid can be DNA or RNA, such as mRNA, rRNA,
tRNA, snRNA, sense RNA or antisense RNA. Also provided is a vector comprising the 
nucleic acid of the invention, as well as host cells comprising the vector or nucleic acid of the 
invention. Suitable vectors and host cells are described in more detail below. Also provided 
are polypeptides encoded by the nucleic acids of the invention. 



WO 2004/003013 



PCT/EP2003/006757 



In a further aspect of the invention, methods for increasing homologous recombination are 
provided by modulating the expression or properties of one or more gene products selected 
from the group consisting of AtIno80, At3g57300, Rvb1, Rvb21, Rvb22, At3g57290,
AtArp5.1, AtArp5.2, AtArp5.3 and AtArp8. In order to increase homologous recombination 
several methods are useful depending on the gene and the gene targeting technique 
employed. Typically, modulation will mean increasing the activity of the gene product, which 
can easily be achieved by methods known in the art. 

In one embodiment, the desired gene is overexpressed in a host cell in an amount sufficient 
to increase homologous recombination in the host cell. By "overexpression" is meant
increasing the amount of the desired gene product in a host cell, compared to untreated cells. A
simple way to achieve overexpression is to produce transgenic host cells, in particular 
transgenic plants, carrying a construct (vector) that ectopically overexpresses the sequence 
of interest under the control of a suitable promoter, such as the 35S CaMV, MAS 
(mannopine synthase) or ubiquitin promoter. 

In another embodiment, an inducible promoter is used to allow an increase in homologous 
recombination frequency at the time and place needed, for example, for gene targeting. 

Alternatively, the construct increasing recombination can be provided at the same time as
the targeting construct by co-transformation; the effect is then achieved by the transient
expression of the construct containing said genes.

Functional fragments, homologues (including orthologs) or derivatives can be easily
identified by alignment with the sequences referred to above. In general, two approaches
to sequence alignment exist. Algorithms as proposed by Needleman & Wunsch and by
Sellers align the entire length of two sequences providing a global alignment of the 
sequences. The Smith-Waterman algorithm on the other hand yields local alignments. A 
local alignment aligns the pair of regions within the sequences that are most similar given 
the choice of scoring matrix and gap penalties. This allows a database search to focus on 
the most highly conserved regions of the sequences. It also allows similar domains within 
sequences to be identified. To speed up alignments using the Smith-Waterman algorithm 
both BLAST (Basic Local Alignment Search Tool) and FASTA place additional restrictions on 
the alignments. 
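
By way of illustration only, the difference between a global and a local alignment can be sketched in Python as follows. The function names and the scoring values (match, mismatch and linear gap penalty) are assumptions chosen for this example; they are not part of the present disclosure and do not reproduce the heuristics or default parameters of BLAST or FASTA.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def needleman_wunsch_score(a, b, match=2, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch) with the same scoring."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Every residue of both sequences must be aligned or gapped.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


# A short conserved stretch embedded in unrelated flanking sequence
# scores well locally but poorly globally.
print(smith_waterman_score("GATTACA", "TTTTGATTACATTTT"))
print(needleman_wunsch_score("GATTACA", "TTTTGATTACATTTT"))
```

In this sketch the local score rewards the shared stretch alone, whereas the global score is reduced by the gaps needed to span the flanking residues, which is why local methods are suited to finding conserved domains.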

Within the context of the present invention alignments are conveniently performed using
BLAST, a set of similarity search programs designed to explore all of the available sequence
databases regardless of whether the query is protein or DNA. Version BLAST 2.0 (Gapped
BLAST) of this search tool has been made publicly available on the internet (currently
http://www.ncbi.nlm.nih.gov/BLAST/). It uses a heuristic algorithm which seeks local as
opposed to global alignments and is therefore able to detect relationships among sequences
which share only isolated regions. The scores assigned in a BLAST search have a
well-defined statistical interpretation. Particularly useful within the scope of the present invention
are the blastp program allowing for the introduction of gaps in the local sequence alignments
and the PSI-BLAST program, both programs comparing an amino acid query sequence
against a protein sequence database, as well as a blastp variant program allowing local
alignment of two sequences only. Said programs are preferably run with optional parameters
set to the default values.

Sequence alignments using BLAST can also take into account whether the substitution of 
one amino acid for another is likely to conserve the physical and chemical properties 
necessary to maintain the structure and function of the protein or is more likely to disrupt 
essential structural and functional features of a protein. Such sequence similarity is 
quantified in terms of a percentage of "positive" amino acids, as compared to the percentage 
of identical amino acids and can help to assign a protein to the correct protein family in
borderline cases.

Specific examples of DNA and encoded proteins according to the present invention are 
described in SEQ ID NOs: 1-18. The putative ATPase/helicase AtIno80 may, as
demonstrated in yeast, be part of a complex containing one or more of Rvb1, Rvb2, Arp5 
and Arp8. All these proteins may be useful in increasing homologous recombination 
frequency. 

Typically, functional fragments or derivatives are characterized by an amino acid sequence
comprising a component sequence of at least 150 amino acid residues having 40% or more
identity with an aligned component sequence of one or more of the polypeptides
encoded by the DNA of SEQ ID NOs: 1, 3, 4, 6, 8, 10, 11, 13, 15, 16 or 18. Preferably the
amino acid sequence identity is higher than 50% or even higher than 55%. Most preferably
the protein sequence is that of SEQ ID NO:2.
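
The identity criterion stated above can be illustrated by the following minimal Python sketch. It assumes that the two component sequences have already been aligned (for example with BLAST) and are supplied as equal-length strings in which "-" denotes a gap; the function names and the choice to exclude gapped columns from the count are assumptions made for this example only.

```python
def percent_identity(aligned_a, aligned_b):
    """Return (compared residues, percent identity) for two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped; this
    convention is an assumption of the sketch, not a rule of the disclosure.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
    identity = 100.0 * identical / compared if compared else 0.0
    return compared, identity


def meets_fragment_criterion(aligned_a, aligned_b, min_len=150, min_identity=40.0):
    """True if at least min_len residues are compared at min_identity or more."""
    compared, identity = percent_identity(aligned_a, aligned_b)
    return compared >= min_len and identity >= min_identity
```

The default thresholds correspond to the 150 residues and 40% identity mentioned above; the preferred values of 50% or 55% can be passed as `min_identity`.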

DNA encoding proteins according to the present invention can be isolated from 
monocotyledonous and dicotyledonous plants. Preferred sources are corn, sugarbeet, 
sunflower, winter oilseed rape, soybean, cotton, wheat, rice, potato, broccoli, cauliflower, 
cabbage, cucumber, sweet corn, daikon, garden beans, lettuce, melon, pepper, squash, 
tomato, or watermelon. However, they can also be isolated from mammalian sources such 
as mouse or human tissues. The following general method can be used, which the person
skilled in the art knows how to adapt to the specific task. A single-stranded fragment of the
desired gene consisting of at least 15, preferably 20 to 30 or even more than 100
consecutive nucleotides is used as a probe to screen a DNA library for clones hybridizing to
said fragment. The factors to be observed for hybridization are described in Sambrook et al.,
Molecular cloning: A laboratory manual, Cold Spring Harbor Laboratory Press, chapters
9.47-9.57 and 11.45-11.49, 1989. Hybridizing clones are sequenced and DNA of clones
comprising a complete coding region encoding a protein characterized by an amino acid
sequence comprising a component sequence of at least 150 amino acid residues having
40% or more sequence identity to the protein sequence encoded by the desired gene is
purified. Said DNA can then be further processed by a number of routine recombinant DNA 
techniques such as restriction enzyme digestion, ligation, or polymerase chain reaction 
analysis. The disclosure of the nucleotide sequences in SEQ ID NOs: 1, 3, 4, 6, 8, 10, 11, 
13, 15, 16 and 18 enables a person skilled in the art to design oligonucleotides for 
polymerase chain reactions which attempt to amplify DNA fragments from templates 
comprising a sequence of nucleotides characterized by any continuous sequence of 15 and 
preferably 20 to 30 or more basepairs of the desired gene. 

Suitable vectors for practicing the methods of the invention are well known in the art. 
Similarly, host cells can be derived from monocotyledonous or dicotyledonous plants.
Preferred sources are corn, sugarbeet, sunflower, winter oilseed rape, soybean, cotton, 
wheat, rice, potato, broccoli, cauliflower, cabbage, cucumber, sweet corn, daikon, garden 
beans, lettuce, melon, pepper, squash, tomato, or watermelon. However, host cells can also 
be isolated from other sources, including mammalian sources such as mouse or human 
cells, in particular stem cells. It is preferred that mammalian homologues are used in 
mammalian cells.

The methods for increasing homologous recombination are useful to obtain gene targeting
so that a gene of interest is introduced into the genome at a desired locus, instead of
randomly. For some hosts, in particular crop plants, the gene is preferably expressed in a
selected tissue where expression is needed. This is easily achieved by the use of a
tissue-specific promoter. Thus, the present invention provides a method for increasing somatic
homologous recombination and increasing gene targeting by modulating the expression or
properties of one or more gene products selected from the group consisting of AtIno80,
At3g57300, Rvb1, Rvb21, Rvb22, At3g57290, AtArp5.1, AtArp5.2, AtArp5.3 and AtArp8 and
fragments, derivatives and homologues thereof, essentially as described above. As is
apparent to one of ordinary skill in the art, the corresponding ortholog is preferably used for 
any particular plant. For example, the corn ortholog of Ino80 is used (or modulated) to 
increase recombination in corn. 

The methods are also useful to improve meiotic recombination, thereby facilitating breeding 
of species, in which genes encoding a particular phenotype are transferred between plants. 
Crossing in an interesting trait from another variety or species into a given variety by 
conventional breeding is a very time- and labour-intensive process. Several generations of
back-crosses have to be carried out to eliminate the undesired genetic material of the donor
species, while maintaining the desired phenotype or trait. Using the methods described
above for increasing homologous recombination, meiotic recombination frequencies can be
increased, preferably by expressing the desired gene under the control of a meiosis-specific
promoter or inducible promoter, so that the breeding process is accelerated. Thus, the present
invention provides a method for increasing meiotic recombination by modulating the
expression or properties of one or more gene products selected from the group consisting of
AtIno80, At3g57300, Rvb1, Rvb21, Rvb22, At3g57290, AtArp5.1, AtArp5.2, AtArp5.3 and
AtArp8 and fragments, derivatives and homologues thereof, essentially as described above.

The Examples below are provided for illustrative purposes and are in no way intended to be 
limiting to the invention.

EXAMPLES

Example 1: Identification of the sm22 gene effective in increasing homologous
recombination.

For our screening we used a newly constructed transgenic Arabidopsis thaliana line
that carries a recombination reporter construct based on the firefly luciferase gene. The
structure of the reporter construct - two segments of the luciferase gene arranged as 
inverted repeats - is comparable to that of the previously described beta-glucuronidase 
reporter (Swoboda et al. 1994, Puchta et al. 1995a, Puchta et al. 1995b), but offers the 
advantage that recombination events can be detected in living plants. Luciferase activity in 
cells in which recombination has restored an intact luciferase gene can be detected by light 
emission after application of the substrate D-luciferin using a high-sensitivity CCD camera 
(Millar et al. 1992, Millar et al. 1995a, Millar et al. 1995b, Michelet & Chua 1996). 

To induce hyperrecombination mutations in the luciferase recombination reporter line, we 
used T-DNA activation tagging with a mutagenic construct (pAC102). "Activation tagging" 
refers to the transcriptional activation of endogenous plant genes by random integration of a 
construct that carries promoter or enhancer sequences. One published approach for
"activation tagging" is the introduction, via Agrobacterium-mediated gene transfer, of a
T-DNA carrying several copies of the cauliflower mosaic virus (CaMV) 35S enhancer (Fang et
al. 1989), which can activate the expression of heterologous genes over a distance (Hayashi 
et al. 1992, Walden et al. 1994, Kakimoto 1996, Kardailsky et al. 1999, Weigel et al. 2000). 
Another published approach is the introduction of a complete, outward-pointing CaMV 35S 
promoter on a transposable Ds element (Wilson et al. 1996, Schaffer et al. 1998, Fridborg et 
al. 1999). The construct "pAC102" used for our experiments is a combination of these 
previously described elements: it is a binary vector carrying a T-DNA, transferable
to plants, that contains a complete, outward-pointing copy of the CaMV 35S
promoter/enhancer close to the right T-DNA border. Thus, this construct combines the ease
of application of T-DNA gene transfer with the genetic ability of a complete promoter, 
avoiding some of the drawbacks of enhancer-only constructs (Weigel et al. 2000). 

In principle, the activation tagging construct can cause several kinds of mutations after 
integration in the plant genome: gene disruption by insertion within a coding sequence,
activation of plant gene expression by action of the CaMV 35S enhancer, direct expression
of a plant gene from the CaMV 35S promoter on the T-DNA, or down-regulation of 
expression by antisense RNA production driven from the CaMV 35S promoter. The pAC102 
T-DNA carries in addition to the 35S promoter a complete copy of the pUC cloning vector to 
facilitate gene cloning by plasmid rescue (Dilkes & Feldmann 1998), and a sulfonamide 
resistance marker (Guerineau et al. 1990, Reiss et al. 1996) for selection of transgenic 
plants. 

We transformed 13,000 three-week-old Arabidopsis ecotype Columbia plants from the
luciferase recombination reporter line with the activation-tagging T-DNA construct "pAC102"
by Agrobacterium-mediated gene transfer, using the established "floral dip" procedure
(Clough & Bent 1998) with a modified infiltration buffer, in which the Silwet L-77 detergent
was replaced by 0.05% Extravon® (Ciba). Seeds from the infiltrated plants were harvested
three weeks after infiltration. Transgenic progeny carrying the pAC102 activation tagging
T-DNA were selected by sowing seeds on perlite substrate drenched with Gamborg B5 mineral
medium (Gamborg et al. 1968) containing 10 mg/l sulfadiazine (Sigma), and transferring 
surviving individuals after 10 days to soil. About 20,000 sulfonamide-resistant plants were 
isolated; they represent independent transformants with the pAC102 T-DNA activation 
tagging construct integrated at different random positions in the Arabidopsis genome. 

When individual plants had grown to the 10-leaf stage, they were assayed for luciferase
activity to detect somatic recombination events. Batches of 25 plants were sprayed with the
substrate D-luciferin and pictures (typically two) were taken with an "Astrocam" (Gloor
Instruments, Uster) by integrating photons over 15 min. Background noise and cosmic
radiation were filtered out by correlating both images using the minimum function. Plants
showing an increased number of sectors with luciferase activity relative to the average of the
population were observed with a frequency of about 1 in 500 plants.
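
The filtering step described above can be illustrated by the following minimal Python sketch, which takes the pixel-wise minimum of two exposures of the same field. The representation of an image as a list of rows of photon counts, the function name and the example values are assumptions made for illustration; they do not describe the camera software actually used.

```python
def minimum_filter(image_a, image_b):
    """Pixel-wise minimum of two equally sized exposures.

    A signal present in only one exposure (for example a cosmic ray hit)
    is suppressed, while a signal present in both exposures is retained.
    """
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("both exposures must have the same dimensions")
    return [
        [min(pa, pb) for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


# Hypothetical photon counts: 40/38 is a sector seen in both exposures,
# 900 and 700 are isolated hits seen in one exposure only.
first = [[2, 3, 900], [40, 2, 3]]
second = [[3, 2, 2], [38, 700, 2]]
print(minimum_filter(first, second))
```

Because a luminescent sector emits light during both exposures whereas a cosmic ray hit is a single event, only the former survives the minimum operation.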

As a result of the screen, a hyperrecombination mutant plant called sm22 was isolated. The
original hyper-recombination phenotype of the sm22 plant shows an enhancement of about
20- to 50-fold for homologous recombination in the reporter line. No other obvious phenotype
was seen and the seed yield was normal. Sulfonamide selection in the second generation
(T2) revealed a 2:1 or 3:1 segregation of resistant seedlings, thus showing that there is only
1 locus (or 2 closely related loci) with an active T-DNA inserted. However, the T2
recombination phenotype was even lower (fewer or the same number of recombination events per
plant) than in the wild type.
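
Whether observed seedling counts are compatible with a 3:1 or a 2:1 segregation is commonly assessed with a chi-square goodness-of-fit test. The following Python sketch illustrates that calculation; the function name and the seedling counts are hypothetical and are not data from the present screen.

```python
# Critical chi-square value for 1 degree of freedom at the 5% level.
CRITICAL_5PCT_1DF = 3.841


def chi_square_ratio(resistant, sensitive, ratio=(3, 1)):
    """Chi-square statistic of observed counts against an expected ratio."""
    total = resistant + sensitive
    expected_resistant = total * ratio[0] / sum(ratio)
    expected_sensitive = total * ratio[1] / sum(ratio)
    return (
        (resistant - expected_resistant) ** 2 / expected_resistant
        + (sensitive - expected_sensitive) ** 2 / expected_sensitive
    )


# Hypothetical counts, for illustration only.
for ratio in ((3, 1), (2, 1)):
    statistic = chi_square_ratio(140, 60, ratio)
    compatible = statistic < CRITICAL_5PCT_1DF
    print(ratio, round(statistic, 3), "compatible" if compatible else "rejected")
```

A statistic below the critical value means that the tested ratio cannot be rejected at the 5% level; with small samples both ratios may remain compatible with the same counts.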

After HindIII digestion of T1 callus genomic DNA prepared essentially according to the
method of the Nucleon Phytopure protocol and Plant DNA extraction kit (Amersham),
plasmid rescue was applied (Dilkes & Feldmann 1998, Mathur et al. 1998), which gave rise
to two independent junction fragments. Briefly, we digested plant genomic DNA with HindIII,
circularized the resulting fragments by ligation at low DNA concentration, and transformed
the ligation mixture into competent E. coli TOP10 cells (commercially available from
Invitrogen) by electroporation. Since the HindIII fragments containing the fusion joint
between plant DNA and the right end of the activation tagging construct carry a plasmid 
origin and the ampicillin resistance gene (bla) contributed by pAC102, circularization of such 
fragments will result in a functional bacterial plasmid and confer ampicillin resistance to the 
E. coli cells. 
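
The logic of plasmid rescue described above can be sketched in Python as an in silico digest: the sequence is cut at each HindIII site (A^AGCTT), and only a fragment carrying both the plasmid origin and the bla gene can give rise to an ampicillin-resistant colony after circularization. The function names are hypothetical, and the short marker strings used in the usage example below stand in for the real origin and bla sequences, which are not reproduced here.

```python
HINDIII_SITE = "AAGCTT"  # HindIII cuts after the first A: A^AGCTT


def hindiii_fragments(dna):
    """Split a linear DNA sequence at every HindIII recognition site."""
    dna = dna.upper()
    cut_positions = []
    pos = dna.find(HINDIII_SITE)
    while pos != -1:
        cut_positions.append(pos + 1)  # cut between the first and second A
        pos = dna.find(HINDIII_SITE, pos + 1)
    bounds = [0] + cut_positions + [len(dna)]
    return [dna[start:end] for start, end in zip(bounds, bounds[1:])]


def rescuable(fragments, required_markers):
    """Fragments containing every required marker sequence."""
    return [f for f in fragments if all(m in f for m in required_markers)]
```

For example, with the placeholder markers `ORI = "CCCGGGCCC"` and `BLA = "TTTAAATTT"` (both hypothetical), digesting `"GATC" + "AAGCTT" + "ACGT" + ORI + "GG" + BLA + "TGCA" + "AAGCTT" + "CATG"` yields three fragments, of which only the middle one is returned by `rescuable(fragments, [ORI, BLA])`.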

Several colonies were obtained after plating the transformed bacteria on selection medium 
containing ampicillin. Plasmid DNA of these transformants was prepared and characterized 
by restriction analysis. To determine the nature of the plant sequences joined to the right end
of the T-DNA, the plant DNA insert from these rescued plasmids was sequenced from both 
sides, using one custom sequencing primer complementary to the T-DNA right end reading 
towards the plant DNA, and the standard M13 reverse sequencing primer, reading from the 
pAC102 vector sequences into the plant DNA insert from the other end. The obtained DNA 
sequences were compared to the GenBank nucleotide database using the BLASTN search 
program. 

Two insertions were identified. The first one corresponds to a single T-DNA insertion without 
deletion (left border, LB, junction sequenced) in the N-terminal region of a putative 
ATPase/helicase gene, in antisense orientation. The second T-DNA was inserted in a gene with
no obvious relationship to homologous recombination (gb AF082176_1) and does not confer
sulfonamide resistance. Six resistant (T3) families were analysed by PCR and Southern blotting.
Only one family contained some plants with the second insertion, whereas all families had
the helicase insertion site.

In subsequent generations, homozygous plants for the helicase insertion site were obtained.
The homologous recombination frequency of heterozygous and homozygous plants for this
insertion site was at least 50% and 15%, respectively, and up to 80% and 20%, respectively,
of the wild type level.

The complete cDNA (4.8kb) was cloned in two steps. First, a public EST containing the 3' 
part was sequenced. Then the 5' part of the cDNA was amplified by RT-PCR on Col-0 
(Arabidopsis Columbia ecotype, wild type) callus RNA (prepared with the Qiagen RNeasy
Plant Kit), using primers in the 5' untranslated region including a stop codon in frame with 
the predicted ATG (smSUT) to make sure that the complete 5' part of the cDNA was 
amplified. The primer sequences were smSUT: ctagaagcttttaaggatTA4gactctcc (SEQ ID 
NO:19) and for 3' primer: ctcgtatgtatcccccttctcc (SEQ ID NO:20). The coding sequence is 
provided as SEQ ID NO.1 and is similar to but not identical to At3g57300 (see Fig.1). 

The predicted helicase gene (8 kb genomic DNA) has about 23 exons encoding a protein of 
1507 amino acids (SEQ ID NO:2). It is predicted to be an ATPase of the Swi2/Snf2 family 
and contains several nuclear localization signals (NLS). The ATPase/helicase gene is the 
putative Arabidopsis ortholog of the yeast Ino80p/YGL150c protein (Ebbert et al. (1999), 
Shen et al. (2000)). Homologs exist in yeast, budding yeast, Drosophila and human. These 
four homologues have several highly conserved regions including the six motifs of the 
SWI2/SNF2 helicase domain. The several NLS suggest a nuclear localization of the gene 
product. 
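
The sizes quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch only: the protein length (1507 amino acids) and the cDNA length (about 4.8 kb) are taken from the text, whereas the average residue mass of 110 Da is a common rule-of-thumb value assumed here and is not part of the disclosure.

```python
# Rough size estimates for the predicted AtINO80 protein.
# From the text: 1507 amino acids, cDNA of about 4.8 kb.
# Assumption (not from the text): an average residue mass of 110 Da.

PROTEIN_LENGTH_AA = 1507
CDNA_LENGTH_NT = 4800
AVERAGE_RESIDUE_MASS_DA = 110


def coding_length_nt(protein_length_aa: int) -> int:
    """Nucleotides needed to encode the protein, including one stop codon."""
    return 3 * (protein_length_aa + 1)


def approximate_mass_kda(protein_length_aa: int,
                         residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Approximate molecular mass in kDa from the residue count."""
    return protein_length_aa * residue_mass_da / 1000.0


cds_nt = coding_length_nt(PROTEIN_LENGTH_AA)
utr_nt = CDNA_LENGTH_NT - cds_nt  # what remains for 5' and 3' untranslated regions
mass_kda = approximate_mass_kda(PROTEIN_LENGTH_AA)

print(f"coding sequence: {cds_nt} nt")
print(f"left for untranslated regions: about {utr_nt} nt")
print(f"approximate mass: {mass_kda:.0f} kDa")
```

Under these assumptions the coding sequence accounts for 4524 nt of the 4.8 kb cDNA, and the estimated mass of about 166 kDa is of the same order as the 171 kDa reported for the monomeric yeast protein.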

In the sm22 heterozygous and homozygous mutants, the Ino80 transcript is present at 
about 50% of the wild type level or is absent, respectively, as measured by semi-quantitative 
RT-PCR (5' At3g57300 primer: TGATGGATCTATCACCATCAG, SEQ ID 
NO:21; 3' At3g57300 primer: ggtgggattccaatcactttc, SEQ ID NO:22) and by northern blot 
hybridization. For this, RNA was extracted from 2-week-old seedlings using the 
QIAGEN RNeasy plant extraction kit following the manufacturer's instructions. Together 
with the decrease of homologous recombination in sm22 plants described above (to 50% of 
the wild type level in heterozygous plants and 15% in the homozygous mutant), this result 
indicates that the level of Ino80 gene product might positively and directly regulate the 
homologous recombination frequency, making this gene a prime candidate for a positive 
regulator of homologous recombination. 



Results with a recombination reporter line 1445 (Gherbi et al. 2001) overexpressing the 
INO80 cDNA under the control of the 35S promoter and with an N-terminal HA-tag 
interrupted by an intron show upregulation of homologous recombination, providing evidence 
that INO80 upregulates homologous recombination. 

The yeast homolog (Ebbert et al., 1999), INO80 (=YGL150c), is part of a large complex of 
>1 MDa (the monomeric form is 171 kDa) containing two essential helicases, Rvb1p and 
Rvb2p, and the actin-related proteins (Arp) Arp4, Arp5 and Arp8 (Cho et al. 2001; Jonsson 
et al. 2001; Wood et al. 2000). Thus, the involvement of Ino80 in homologous recombination 
implicates the activity of these other genes in homologous recombination. Human 
homologues of Rvb1p and Rvb2p are also known (Kanemaki 1999, Ikura et al. 2000, Shen et al. 2000). 

In Arabidopsis thaliana we found three genes closely related to Rvbs from other organisms, 
AtRvb1 (SEQ ID NO:4, SEQ ID NO:5), AtRvb21 (SEQ ID NO:6, SEQ ID NO:7) and 
AtRvb22 (SEQ ID NO:8, SEQ ID NO:9). The three genes are expressed (RT-PCR) and 
some of them are positively regulated by genotoxic stress (UV-C, bleomycin). For chemical 
treatment, 2-week-old Arabidopsis seedlings were placed under sterile conditions 
in liquid GM medium containing 10^-6 M bleomycin (BLM; Sigma) or 100 ppm of MMS (Fluka, 
Switzerland). For UV-C irradiation (6000 ergs), 2-week-old seedlings were irradiated with light 
provided by an HNS 55W OFR lamp (Osram). After treatment, plants were harvested at 
several time points (30 min, 1 h, 4 h and 12 h) and RNA was extracted as described above. Then 
semi-quantitative RT-PCR analysis was performed with the following primers: AtIno80 
(TGATGGATCTATCACCATCAG, SEQ ID NO:23; ggtgggattccaatcactttc, SEQ ID NO:24), 
AtRvb1 (tttgatgggccaaatgatg, SEQ ID NO:25; cttccaaCCTAGGtgagatgtttcaacaaaatgtgc, 
SEQ ID NO:26), AtRvb21 (tcaacagcaggacacaagg, SEQ ID NO:27; 
cccaatgCCTAGGaaatccgagttcaacatcctaatc, SEQ ID NO:28) and AtRvb22 
(acaaaccagatatcagcacatgg, SEQ ID NO:29; aacaagtactcgctctcatgctc, SEQ ID NO:30). 
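
By way of illustration only, basic properties of the primers listed above (length, GC content and a rule-of-thumb melting temperature) may be tabulated as sketched below. The sequences are those given in the text. The labels "forward" and "reverse" for the first and second primer of each pair, and the use of the simple 2(A+T) + 4(G+C) estimate, are assumptions made for this sketch. That estimate is only a rough guide and is of limited value for the longer primers, which carry additional upper-case bases.

```python
# Primer pairs as listed in the text, with whitespace removed.
# Assumption: the first primer of each pair is treated as "forward"
# and the second as "reverse". Case is kept as printed but ignored
# in the calculations.
PRIMERS = {
    "AtIno80": ("TGATGGATCTATCACCATCAG", "ggtgggattccaatcactttc"),
    "AtRvb1": ("tttgatgggccaaatgatg", "cttccaaCCTAGGtgagatgtttcaacaaaatgtgc"),
    "AtRvb21": ("tcaacagcaggacacaagg", "cccaatgCCTAGGaaatccgagttcaacatcctaatc"),
    "AtRvb22": ("acaaaccagatatcagcacatgg", "aacaagtactcgctctcatgctc"),
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def rule_of_thumb_tm(seq: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


for gene, pair in PRIMERS.items():
    for label, seq in zip(("forward", "reverse"), pair):
        print(f"{gene:8s} {label:8s} len={len(seq):2d} "
              f"GC={gc_fraction(seq):.2f} Tm~{rule_of_thumb_tm(seq)} C")
```

Such a table is merely a convenience for comparing the primer pairs; annealing conditions would in practice be determined empirically.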

To characterize further the Atino80-1 HR deficiency, we subjected the mutant to various 
genotoxic stresses. In parallel with the original ino80 mutant we also tested two allelic 
mutants of INO80 from the publicly available SAIL mutant collection. Sensitivity to bleomycin, 
mitomycin-C or UV-C was not shifted in the Atino80-1 mutants in any of the various 
conditions tested. All the Atino80-1 alleles seem to be slightly hypersensitive to MMS (methyl 
methanesulfonate), which is also known to induce DNA double-strand breaks. The difference 
in sensitivity is visible at 60 and 80 ppm of MMS on root elongation and rosette growth. Most 
mutations affecting DNA repair or recombination also give rise to changes in the sensitivity 
to DNA damaging agents. We challenged Atino80-1 mutant plants with various treatments 
known to induce DNA damage and recombination. None of the tested agents (UV-C, 
bleomycin, mitomycin-C and MMS) gave rise to a shift in sensitivity, with the exception of a 
slight change for MMS. This suggests a difference for the role of INO80 in plants compared 
to yeast (Ebbert et al., 1999; Shen et al. 2000) and supports the use of AtINO80 to regulate 
homologous recombination without affecting the major repair pathway in plants. 

In the sm22 background, the steady state level of AtRvb21 and AtRvb22 was shown to be 
down-regulated using RT-PCR on RNA extracted as described above. 
This indicates that the components of the putative Arabidopsis INO80 complex show 
co-regulation at the transcriptional level, supporting the use of Arabidopsis Rvb1, Rvb21 and 
Rvb22 and the Arabidopsis Arp protein orthologs to manipulate homologous recombination 
frequency in plants. 

Example 2: AtRvb1 as positive regulator of homologous recombination. 

As described above (Example 1), the original recombination-up phenotype found in sm22 can 
be associated with an effect mediated by the Arabidopsis Rvb1 and 2 orthologs. Thus, 
AtRvb1 can be used as a positive regulator of homologous recombination. 

Example 3: AtRvb21 as positive regulator of homologous recombination. 

As described above (Example 1), the original recombination-up phenotype found in sm22 can 
be associated with an effect mediated by the Arabidopsis Rvb1 and 2 orthologs. Thus, 
AtRvb21 can be used as a positive regulator of homologous recombination. 

Example 4: AtRvb22 as positive regulator of homologous recombination. 

As described above (Example 1), the original recombination-up phenotype found in sm22 can 
be associated with an effect mediated by the Arabidopsis Rvb1 and 2 orthologs. Thus, 
AtRvb22 can be used as a positive regulator of homologous recombination. 

Example 5: At3g57290 as positive regulator of homologous recombination. 

In the sm22 mutant (Example 5), the At3g57290p gene is potentially overexpressed by the 
35S enhancer/promoter. Overexpression of this gene in the sm22 context or directly with a 
35S promoter can be carried out to reproduce the original recombination-up phenotype. The 
phenotype was lost in the second generation (Example 1), at which point At3g57290 is no 
longer overexpressed, allowing temporally limited modulation of homologous 
recombination. 

Example 6: AtArp as positive regulators of homologous recombination. 

As described above (Example 1), the original recombination-up phenotype found in sm22 can 
be associated with an effect mediated by other components of the Arabidopsis INO80 
complex, such as the Arp homologues AtArp5.1, AtArp5.2, AtArp5.3 and/or AtArp8. Any of 
these Arp homologues can be used as a positive regulator of homologous recombination, 
alone or in combination. 

All publications referred to herein as well as the disclosure of GB patent application 
0214896.3 are incorporated by reference as if each is referred to individually. 



